{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Matplotlib 数据可视化基础\n",
    "\n",
    "\n",
    "## 二.pyplot绘制图形的主要函数\n",
    "### 4.绘制柱状图\n",
    "### 5.绘制条形图\n",
    "### 6.绘制直方图\n",
    "### 7.绘制箱线图\n",
    "### 8.绘制雷达图\n",
    "\n",
    "\n",
    "## 知识目标：掌握pyplot常用绘制图形函数及其参数\n",
    "## 技能目标：掌握柱状图、条形图、直方图等常见图形的绘制\n",
    "## 重点：柱状图、条形图、直方图的绘制\n",
    "## 难点：雷达图\n",
    "========================================================="
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 二.pyplot绘制图形的主要函数\n",
    "### 4.绘制柱状图\n",
    "柱状图是一种以长方形的长度为变量表达图形的统计图，由一系列高度不等的纵向条状表示数据分布的图形。适用于比较多组数据之间的大小。\n",
    "\n",
    "语法：plt.bar(x,height,width=0.8,bottom=None,align='center')\n",
    "#x：表示x轴的数据\n",
    "#height：表示柱子的高度\n",
    "#width：表示柱子的宽度，默认为0.8\n",
    "#bottom：用来指定每个柱底部边框的y坐标\n",
    "#align：用来指定每个柱子的对齐方式，取值可以为edge,center（居中）,edge(且width>0)表示柱子左边框与x坐标对齐，即左对齐；edge(且width<0)表示右对齐"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "plt.rcParams['font.family']='SimHei'         \n",
    "plt.rcParams['axes.unicode_minus']=False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#例10：根据“烧烤店营业额”数据绘制柱状图。\n",
    "data1 = pd.read_excel(r'D:\\pyData\\烧烤店营业额.xlsx')\n",
    "plt.bar(x=data1['月份'],height=data1['营业额（万元）'])\n",
    "plt.title('2019年烧烤店营业额')\n",
    "plt.savefig('D:\\pyData\\例10')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "思考：如果每个x值对应有多组数据，需要绘制多根柱子呢？"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#例11：根据“商场营业额”数据绘制柱状图\n",
    "#生成商场营业额数据\n",
    "data2 = pd.DataFrame({'月份': [1,2,3,4,5,6,7,8,9,10,11,12],\n",
    "              '男装': [51,32,58,57,30,46,38,38,40,53,58,50],\n",
    "              '女装': [70,30,48,73,82,80,43,25,30,49,79,60],\n",
    "              '餐饮': [60,40,46,50,57,76,70,33,70,61,49,45],\n",
    "              '化妆品': [110,75,130,80,83,95,87,89,96,88,86,89],\n",
    "              '金银首饰': [143,100,89,90,78,129,100,97,108,152,96,87]})\n",
    "\n",
    "#方法1：使用plt.bar绘制\n",
    "plt.figure(figsize=(12,8))\n",
    "width=0.1\n",
    "plt.bar(x=data2['月份']-2*width,height=data2['男装'],width=0.1)\n",
    "plt.bar(x=data2['月份']-width,height=data2['女装'],width=0.1)\n",
    "plt.bar(x=data2['月份'],height=data2['餐饮'],width=0.1)\n",
    "plt.bar(x=data2['月份']+width,height=data2['化妆品'],width=0.1)\n",
    "plt.bar(x=data2['月份']+2*width,height=data2['金银首饰'],width=0.1)\n",
    "plt.xlabel('月份')\n",
    "plt.ylabel('营业额（万元）')\n",
    "plt.xticks(np.arange(1,13,1))\n",
    "plt.yticks(np.arange(10,160,10))\n",
    "plt.title('商场营业额',fontsize=16)\n",
    "plt.legend(['男装','女装','餐饮','化妆品','金银首饰'])\n",
    "plt.savefig('D:\\pyData\\例11')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#方法2：使用plot绘制\n",
    "data2.plot(x='月份',kind='bar')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5.绘制条形图\n",
    "条形图等同于柱状图的横置展示，用条形的长短表示数据的多少。\n",
    "\n",
    "plt.barh(y,width,align='center')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#例12：将例10中的数据用条形图展示出来，并添加数值。\n",
    "data1 = pd.read_excel(r'D:\\pyDa·ta\\烧烤店营业额.xlsx').sort_values(by='营业额（万元）')\n",
    "plt.barh(y=data1['月份'],width=data1['营业额（万元）'])\n",
    "plt.title('2019年烧烤店营业额')\n",
    "for y,x in enumerate(data1['营业额（万元）'].values):     #enumerate 遍历数值与其对应的下标，生成多个元组\n",
    "    plt.text(x+0.2,y-0.2,'%s'%x)    #  '%s'%x表示将x值格式化为字符串\n",
    "plt.savefig('D:\\pyData\\例12')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6.直方图\n",
    "直方图又称质量分布图，在质量管理中，直方图通过对收集到的貌似无序的数据进行处理，来反映产品质量的分布情况，判断和预测产品质量及不合格率。\n",
    "\n",
    "语法：plt.hist(x,bins=None,range=None, density=None, bottom=None, histtype='bar', align='mid', log=False, color=None, label=None, stacked=False, normed=None)\n",
    "\n",
    "#x: 数据集，最终的直方图将对数据集进行统计\n",
    "#bins: 统计的区间(直方图的个数)\n",
    "#histtype: 可选['bar', 'barstacked', 'step', 'stepfilled']之一，默认为bar，推荐使用默认配置，step使用的是梯状，stepfilled则会对梯状内部进行填充，效果与bar类似\n",
    "#align: 可选{'left', 'mid', 'right'}之一，默认为'mid'，控制柱状图的水平分布，left或者right，会有部分空白区域，推荐使用默认\n",
    "#log: bool，默认False，即y坐标轴是否选择指数刻度\n",
    "#stacked: bool，默认为False，是否为堆积状图\n",
    "\n",
    "直方图与柱状图的区别：\n",
    "#1.柱状图的高度表示数值的大小，直方图的高度表示数值出现的频数\n",
    "#2.柱状图的每根柱子是分开的，用于展示分类数据；直方图的柱子是连续的，用于展示数据型数据。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#例13：创建高斯随机分布数100个，均值为100，标准差为20，并用直方图展示其分布情况。\n",
    "arr=np.random.normal(100,20,100)   #高斯随机分布，均值和标准差\n",
    "plt.hist(arr,bins=10,histtype='step',facecolor='b',alpha=0.5)  \n",
    "plt.title('直方图')\n",
    "plt.savefig('D:\\pyData\\例13')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#例14：读取“高考成绩”数据，用直方图分别展示“英语”和个人“总分”\n",
    "scores = pd.read_excel(r'D:\\pyData\\高考成绩.xlsx')\n",
    "\n",
    "plt.figure(figsize=(10,4))\n",
    "plt.subplot(121)\n",
    "plt.hist(scores['英语'],bins=5,facecolor='b',alpha=0.75)  \n",
    "plt.xlabel('分数段')\n",
    "plt.ylabel('人数')\n",
    "plt.title('英语成绩分布')\n",
    "\n",
    "def sroces_sum(df):\n",
    "    a = df['语文']\n",
    "    b = df['数学']\n",
    "    c = df['英语']\n",
    "    d = df['综合']\n",
    "    return a+b+c+d\n",
    "scores['总分']  = sroces_sum(scores)\n",
    "plt.subplot(122)\n",
    "plt.hist(scores['总分'],bins=5,facecolor='r')  \n",
    "plt.xlabel('分数段')\n",
    "plt.ylabel('人数')\n",
    "plt.title('总分成绩分布')\n",
    "\n",
    "plt.savefig('D:\\pyData\\例14')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.绘制箱线图\n",
    "箱线图也称箱须图，是一种用作显示一组数据分散情况资料的统计图。它主要用于反映原始数据分布的特征，还可以进行多组数据分布特征的比较。\n",
    "\n",
    "语法：plt.boxplot(x, notch=None, sym=None, vert=None, whis=None, positions=None, widths=None, patch_artist=None, meanline=None, showmeans=None, showcaps=None, showbox=None, showfliers=None, boxprops=None, labels=None, flierprops=None, medianprops=None, meanprops=None, capprops=None, whiskerprops=None)\n",
    "#x：指定要绘制箱线图的数据；\n",
    "#notch：是否是凹口的形式展现箱线图，默认非凹口；\n",
    "#sym：指定异常点的形状，默认为o号显示；\n",
    "#vert：是否需要将箱线图垂直摆放，默认垂直摆放；\n",
    "#whis：指定上下须与上下四分位的距离，默认为1.5倍的四分位差；\n",
    "#positions：指定箱线图的位置，默认为[0,1,2…]；\n",
    "#widths：指定箱线图的宽度，默认为0.5；\n",
    "#patch_artist：是否填充箱体的颜色；\n",
    "#meanline：是否用线的形式表示均值，默认用点来表示；\n",
    "#showmeans：是否显示均值，默认不显示；\n",
    "#showcaps：是否显示箱线图顶端和末端的两条线，默认显示；\n",
    "#showbox：是否显示箱线图的箱体，默认显示；\n",
    "#showfliers：是否显示异常值，默认显示；\n",
    "#boxprops：设置箱体的属性，如边框色，填充色等；\n",
    "#labels：为箱线图添加标签，类似于图例的作用；\n",
    "#filerprops：设置异常值的属性，如异常点的形状、大小、填充色等；\n",
    "#medianprops：设置中位数的属性，如线的类型、粗细等；\n",
    "#meanprops：设置均值的属性，如点的大小、颜色等；\n",
    "#capprops：设置箱线图顶端和末端线条的属性，如颜色、粗细等；\n",
    "#whiskerprops：设置须的属性，如颜色、粗细、线的类型等；"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#例15：生成均值为0，标准差分别为1,2,3的高斯随机分布数各100个，并用箱线图展示，判断有无异常值\n",
    "data=[np.random.normal(0,std,100) for std in range(1,4)]\n",
    "plt.boxplot(data,notch=True,patch_artist=True)  \n",
    "plt.savefig('D:\\pyData\\例15')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#例16：判断“高考成绩”数据中语数外三科有无异常值\n",
    "arr2=[i for i in scores.iloc[:,4:7].values.T]\n",
    "plt.boxplot(arr2,notch=True,patch_artist=True)  \n",
    "plt.savefig('D:\\pyData\\例16')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.绘制雷达图\n",
    "雷达图也称作极坐标图、星图、蜘蛛网图，常用语企业经营状况的分析，便于企业管理者及时发现薄弱环节进行改进，也可以用于发现异常值。\n",
    "\n",
    "语法：plt.polar()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 例17：P215 9-9\n",
    "#某学生的课程与成绩\n",
    "courses = ['C++', 'Python', '高数', '大学英语', '软件工程',\n",
    "           '组成原理', '数字图像处理', '计算机图形学']\n",
    "scores = [80, 95, 78, 85, 45, 65, 80, 60]\n",
    "\n",
    "dataLength = len(scores)               # 数据长度\n",
    "\n",
    "# angles为等差数组，把圆周等分为dataLength份\n",
    "angles = np.linspace(0,                # 数组第一个数据\n",
    "                     2*np.pi,          # 数组最后一个数据\n",
    "                     dataLength,       # 数组中数据数量\n",
    "                     endpoint=False)   # 不包含终点\n",
    "\n",
    "scores.append(scores[0])\n",
    "angles = np.append(angles, angles[0])  # 闭合\n",
    "# 绘制雷达图\n",
    "plt.polar(angles,                      # 设置角度\n",
    "          scores,                      # 设置各角度上的数据\n",
    "          'rv--',                      # 设置颜色、线型和端点符号\n",
    "          linewidth=2)                 # 设置线宽\n",
    "\n",
    "# 设置角度网格标签\n",
    "plt.thetagrids(angles*180/np.pi,      # 求出angles中每个角度\n",
    "               courses)               # 对每个角度填充对应的标签\n",
    "\n",
    "# 填充雷达图内部\n",
    "plt.fill(angles,scores,facecolor='r',alpha=0.3)\n",
    "plt.savefig('D:\\pyData\\例17')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
